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Abstract: We describe extensions to the SIESTA density functional theory (dft) code [3D], for 
the simulation of isolated molecules and their absorption spectra. The extensions allow for: 

• Use of a multigrid solver for the Poisson equation on a finite dft mesh. Non-periodic, 
Dirichlet boundary conditions are computed by expansion of the electric multipoles over 
spherical harmonics. 

• Truncation of a molecular system by the method of design atom pseudo-potentials of Xiao 
and Zhang[32 . 

• Electrostatic potential fitting to determine effective atomic charges. 

• Derivation of electronic absorption transition energies and oscillator strengths from the raw 
spectra produced by a recently described, order 0(N 3 ), time-dependent dft code[5T]. The 
code is furthermore integrated within siesta as a post-processing option. 

Key- words: multigrid solver, dft /tddft computation, molecular systems, siesta 
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Extensions du code DFT SIESTA pour la simulation de 

molecules. 



Resume : Nous decrivons les extensions au code siesta (3UI de la theorie de la fonctionnelle 
de densite (dft), pour la simulation des molecules isolees et leurs spectres d'absorption. Ces 
extensions permettent : 

• l'utilisation d'un solveur multigrille pour l'equation de Poisson sur le maillage dft Les 
conditions aux limites de Dirichlet sont calculees par un developpement en harmoniqucs 
spheriques du potentiel electriquc : 

• la coupure du systeme moleculaire a l'aide du pseudo-potentiels de l'atome sur mesure de 
Xiao Zhang[32j ; 

• le calcul des charges effectives atomiques par la methode de l'ajustement du potentiel 
electrostatique ; 

• Calcul des energies de transition d'absorption electroniques et des forces d'oscillateur a 
partir des spectres bruts obtenus par un code dft dependant du temps(2T]. Le code est en 
outre integre dans siesta comme une option de post-traitement. code|21|. 

Mots-cles : solveur multigrille, calculs dft /tddft systeme moleculaire, siesta 
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1 Introduction 

The SIESTA program is well established in the field of simulation of solids with density functional 
theory (dft)[3U]. It is used in an ever widening range of applications, benefitting from regular 
maintenance and extensions of the code's capabilities [51 125]. SIESTA employs numerical atomic 
orbitals (AO's) with strictly finite range, leading to order- TV scaling of computations with respect 
to the number of atoms, siesta was thus an attractive dft engine for a new, fast time-dependent 
dft (tddft) algorithm for molecular systems, based on use of dominant products of finite 
orbitals, and scaling as 0(N 3 ) with the number of atoms JV[3T]. Previously, this tddft code 
used orbital and density matrix data from files produced by SIESTA. Moreover, it provided only 
a raw spectrum. We have therefore undertaken to couple the program directly to siesta and to 
extract transition energies and oscillator strengths from the raw spectrum. Furthermore, siesta 
remains a periodic dft code, meaning that to simulate a molecular system, very large, essentially 
empty dft meshes may be necessary to effectively isolate the system from its periodic images. 

The present contribution therefore describes extensions of SIESTA useful in the field of molec- 
ular systems: 

1. Adaption to molecular computations by (i) Computation of Dirichlct boundary conditions 
in the Poisson equation, by development of electric multipolar moments, up to order 4 in 
spherical harmonics; and (ii) Introduction of a multi-grid solver for the Poisson equation; 

2. Introduction of an electrostatic potential fit algorithm for the assignment of atomic partial 
charges; 

3. Implementation of 'design atom' pseudo-potentials [32J, allowing truncation of a molecular 
system by replacing a bond by a tailor-made lone pair; 

4. Direct coupling to siesta of the order 0(N 3 ) tddft code fast, with extraction of transi- 
tion energies and oscillator strengths. 

Extensions 1-3, discussed in the correspondingly numbered subsections of part [2] are available 
as a set of patches of siesta, development version 431, downloadable at 

https://gforge.inria.fr/frs/?group_id=117££] The fast dft code is available at the same 

URL. 



2 Extensions of SIESTA 

2.1 Solution of the Poisson equation for non-periodic systems 

During self-consistency cycles in a dft computation, recourse is made at each cycle to the Hartree 
energy, 

E h = J p(r)V(r)dr, (1) 

approximated as a discrete sum over a mesh of points encompassing the system. The electrostatic 
potential, V(r), itself is formally a space integral of contributions of the electronic density, p(r), 

V(r)= f P (s) r ^-^ds. (2) 



1 Access to be opened on acceptance of the present paper 
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Figure 1 : Convergence of the physical properties of the water molecule with respect to the side 
of the simulation box: (a) Energy; (b) Electric dipole moment. Symbols : o, periodic simulation, 
using the standard FFT solver; □ and A, non-periodic simulation, using a second order, 7-point 
or a fourth order, compact 16-point stencil respectively for the Laplacian in the multi-grid solver 
for the Poisson equation. Data are plotted as differences with respect to the converged values 
(FFT, large box). See section 2.1 for simulation conditions. 



Rather than directly integrating the contributions of infinitesimal volume elements of the density, 
it is much more efficient to solve the Poisson equation for V(r): 

AV(r) = -A7rp(r)/e , (3) 

with suitable boundary conditions, where A is the Laplacian. We refer the reader to [3U| for 
details of implementation in siesta, in particular partition of the problem into neutral atom and 
bond contributions to p. 

Because siesta is a periodic code, in which the system and the DFT mesh (typically one crystal 
unit cell) are replicated indefinitely in all directions, a particularly effective solution of eqn. (J3j) 
is obtained by transforming to fc-space using fast Fourier transforms (FFT). This method is ideal 
for crystals, but for finite molecular systems the DFT mesh (periodic cell) must be made large 
enough to damp out interactions of the system with its periodic images. Physical properties 
may converge only slowly with the mesh size. Figure ([ljt,,b) illustrates this for a toy model of an 
isolated water molecule. Although the molecule and its orbitals hold within a cube of side 5.7 A , 
a dft mesh of side 20 A is required to achieve good convergence of the energy (« — 465.8 eV) 
and dipole moment (w 1.389 D) in the periodic FFT computation. 

We refer repeatedly to this model, using the standard double-zeta with polarisation basis 
set of SIESTA (dzp, energy shift parameter 0.02 Ry), in the local density approximation (lda, 
Ceperley- Alder exchange-correlation j25| ) . The mesh cutoff is 400 Ry. Heuristics in SIESTA cause 
the corresponding mesh step to vary slightly with the box size, in the range 0.08 ± 0.002 A m 
all calculations. The tolerances for convergence of the density matrix and energy are 10~ 6 and 
10 -6 eV respectively. Standard Troullier-Martins pseudo-potentials from the SIESTA library are 
used throughout. The model is used at the optimised geometry. 
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In fact, the solution of eqn. ([3| for finite systems is determined uniquely by the values on a 
closed surface enclosing the charge density i.e. Dirichlet boundary conditions. The DFT mesh for 
a finite system can thus be made quite small, provided there is a recipe for accurately specifying 
the solution on the surface of the dft mesh. 

A molecular computation thus needs; (i) an independent means of specifying the Dirichlet 
boundary conditions at the surface points of the finite dft mesh; (ii) an efficient solver for the 
boundary value problem of V(r) in the interior of the domain. 

Dirichlet boundary conditions by expansion of multipoles over spherical harmonics. 

Dirichlet boundary conditions may be determined by either of two alternative multipole expan- 
sions of || r l s || hi eqn (pi: Cartesian multipoles or spherical harmonics. Cartesian expansion in 
the inverse distance[19 is easy and cheap to compute up to second order, p = 2, but tedious to 
pursue to higher orders. Furthermore, the number of terms grows as p 3 . Spherical harmonics 
cost more to evaluate, but are readily extended to any order. The number of terms grows only 
as (p+ l) 2 and their complexity is lower than in the Cartesian approach. We have implemented 
expansion over spherical harmonics in SIESTA. The electrostatic potential on the surface of the 
simulation box is approximated by an expansion up to order p as follows [18J: 

p 1 (—X\ m 
V(r)*V p (r) = E ^rAOT m (M), 

1=0 m=-l 

Mr = f d 3 sp(s)s l Y l - m (s), 

where (r, 9, 4>) are the spherical coordinates of point r. In practice, the electronic density p is 
known on the DFT mesh points, and non- vanishing only within the supports of the finite numerical 
orbitals employed in SIESTA. We allow only orthorhombic meshes for the multi-grid solver, since 
an arbitrary mesh is pointless for computation of an isolated molecule. Therefore, although we 
develop the potential over spherical harmonics on the boundaries we express them in Cartesian 
form. Setting cf>(s) = s'Fj~ m (s) at mesh point s, we approximate M, m by 

a\p(s)jtO 

where <f>(s) is computed only at mesh-points with non- vanishing density, i.e. above a user-defined 
threshold. The volume element of the mesh is dV. 

The multipole expansion converges formally for all r outside the support of p(s). Accurate 
Dirichlet conditions may be obtained either by: increasing the size of the dft mesh box, im- 
proving in eqn. ^ the separation between surface points r and charges at interior points s; or 
by increasing the order of the expansion. In dft computations, intensive use of the mesh points 
drives the balance towards higher order expansion on a smaller mesh. 

If the supports of the numerical orbitals impinge on the surface of the dft mesh, accuracy is 
certainly compromised. In our implementation, we use the atomic coordinates, and the radii of 
the atomic orbitals to check for this problem. However, it is always advisable to perform a series 
of exploratory calculations with different box sizes and multipole orders, to prove convergence of 
computed properties with respect to errors in the expansion. 

As can be seen from figure [T] both the total energy and the dipole moment of water are highly 
converged when boundary conditions are computed with fourth order spherical harmonics in a 
box of only 7 A , or a clearance of barely 0.65 A around the atomic orbitals used to develop p{r). 
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Figure 2: Dependence of the properties of the water molecule on the mesh step in the finite model 
with the second order MG solver, box side 7A. Data are shown as differences with respect to 
the well converged FFT solution of a large periodic model (box side 40 A), (a) Energy (ref. - 
465.800453 eV); (b) Dipole moment (ref. 1.3887 D). 



Multi-grid solver for the Poisson equation. Solution of the Poisson equation on a regular 
finite mesh proceeds by discretizing the Laplacian operator by finite differences. One obtains a 
system of linear equations. A^V = /, for V at points in the interior of the domain. Vector / 
contains the Dirichlet conditions. The sparse, symmetric, positive definite matrix Ah has the 
classical stripe pattern described by a 'stencil'. A 7-point second order stencil and a compact, 
16-point fourth order solver have been included in the present implementation. The matrix size 
is N = Ni x N 2 x N 3 where Ni — Li /hi is the number of mesh points in the z-direction and hi 
is the mesh step in i-direction. 

Iterative methods are attractive for solving the Poisson equation inside the SCF loop, rather 
than direct methods like Cholesky factorization. It is well known that multi-grid solvers are the 
fastest iterative methods for solution of the Poisson equation in a rectangular box. The complex- 
ity is linear in the system size, even better than for a periodic solver based on FFT. Reference|8j 
discusses multigrid methods (MG) in detail. One efficient parallel multigrid software package is 
the HYP RE software library |13l HI |2]. Here, we wrap this general library to make it available 
within siesta. We use the structured-grid interface and either PFMG, a semicoarsening multi- 
grid solver that uses pointwise relaxation, or preconditioned conjugate gradients (PCG)[12 . The 
PFMG solver allows different parallel smoothers, including the red-black Gauss-Seidel method. 
The PGC method uses PFMG as a preconditioner. Our wrapper is very flexible and can be 
extended easily to more complex operators like higher order discretizations of the Laplacian. 
Different kinds of mesh distribution are also available. We use classical, uniform 2-D real-space 
domain decomposition and the new SIESTA parallelization scheme based on balanced real-space 
domain decomposition, achieved with a recursive bisection algorithm|28 . Ideally, an irregular or 
even unstructured mesh would adjust to the gradient of the electronic density, leading to a finite 
element formulation of the problem, a future refinement possible within the hypre library. 
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Performance. Implementation of the HYPRE MG solver was checked by generating the Dirich- 
let conditions for distributions of point charges, numerically solving the Poisson equation in empty 
space and checking the solution against the known analytical result. The embedding of the MG 
solver in SIESTA was checked by comparing to results with the standard FFT in SIESTA, on the toy 
model of water mentioned above. In what follows we found it sufficient to expand the Dirichlct 
boundary conditions over spherical harmonics up to order four. 

Figure |l^,b) shows, for the same mesh step in the FFT and MG calculations (0.08 A or plane 
wave cutoff of 400 Ry) , the superior convergence of the physical properties with the finite box 
size, achievable with the MG solver and Dirichlet conditions, compared to that of the solution 
under periodic boundary conditions, with the FFT solver. For the given mesh step, the energy 
obtained with the second order MG solver is 5 meV higher than the FFT solution, and the dipole 
moment 0.1 mD larger. Indeed, the final accuracy of the dft computation depends on the step 
of the DFT mesh and on the order of discretization of the operator. The energy obtained with 
the second order discrete Laplacian operator is converged with respect to box size, but to a value 
significantly different from the FFT result. The fourth order approximation, using a 16-point 
stencil, on the other hand, gives the same results as the FFT calculation. Figure ([^jb) confirms 
convergence of the MG solutions to the FFT values as the step size is reduced, for the second 
order solver, where the effect is most significant. Figure 1 of the supplementary information(SI) 
shows the linear scaling of the algorithm with the system size. 

2.2 Electrostatic potential fit of partial atomic charges 

Often it is desirable to assign charges to atomic centres, to be used as proxies for the full electronic 
density p(r). Such schemes, e.g. Mulliken charges and 'atoms in molecules' analysis[31| , each 
have their own advantages and disadvantages. In classical molecular dynamics, partial atomic 
charges 'best fit' to represent the Coulomb forces would be desirable. In practice, 'best fit' 
representation of the Coulomb potential itself is much more tractable analytically and more 
common than fitting the gradient of the potential. 

We therefore have added such an electrostatic potential (esp) fit routine to siesta. This is 
a classic problem [71 [T51 125] . Here we give the minimum details to describe our implementation. 
The problem is to determine a set of partial atomic charges qi,i — 1, N on N atomic centres 
Ri, such that their Coulomb potential is a good approximation to the full molecular potential 
at a set of test points Sj,j = 1, . . . ,M, commonly chosen near the molecular van der Waals 
surface. Let Uj be the full molecular electrostatic potential at Sj: 



where Zi is the nuclear charge on atom i (in the case of pseudo-potentials, the valence charge). 
These values are to be approximated by 




JV 




where the partial charges qi are determined by least squares minimisation of the error 



M 



x 2 (M) = E(^-^) 2 - 
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As usual, we add via a Lagrangian multiplier, A, the constraint that the qi should sum to the 
total charge of the molecule, Q. The problem is then to find stationary points (minima) of the 
Lagrangian 



N 



£({%},A) = x 2 ({%}) + A 



After a little algebra one finds the qi (and multiplier A) as solutions of an (N + 1) x (N + 1) 
linear system, the matrix equation 

(4) 



with 
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£ is the NxM matrix with elements = 1/ Sij , with s 



\Ri 



It long has been known [T6] that matrix A' may be ill-conditioned or singular, depending on 
the choice of the test points Sj , because the far field of a set of charges is determined principally 
by their lowest order multipoles. In our implementation we use as Sj's a subset of the DFT mesh 
points, since the full electrostatic potential is known already on the DFT mesh at the end of the 
SCF cycles. Similar to a Connolly surfacejTT], the mesh points chosen are those at least at a 
distance R mm from all nuclei, and at most i? m i n + SR from some nucleus (the 'skin' thickness 
5R being in practice ps 0.1 A). Our experience with the finite support AO's in siesta is that 
ESP charges are physically sensible and relatively insensitive to i? m i n when it is chosen from just 
greater than the largest AO radius, up to around 5 A beyond. For larger values, matrix A' may 
indeed become singular. Typical mesh sizes generally lead to several thousand points within the 
skin, so that the main cause of any indetermination of the charges is the singularity of A' when 
all the points are chosen too far away from the molecule. 

We solve equation ^ using a truncated SVD algorithm, starting with QR factorization of 
A' as A' = QR, where Q is an orthogonal matrix and R is upper triangular. Singular value 
decomposition of R yields R = UWV T , where U and V are orthogonal and W is diagonal 
(diagonal elements Wi). Small singular eigenvalues, signifying singularity of the matrix and 
indetermination of the qi, are removed by setting Wi — if \wi/w max \ < e. We standardly choose 
e = 10~ 9 . The solution, q' , of eqn. fil is then 



VW 



- 1 U T Q T b'. 



Routines from the BLAS|23| and lapack|5J libraries are used here to perform these operations. 
The calculation is parallel. 

Figure ^ shows the ESP charges found for polar groups in indigo, as a function of the ESP 



2.1 



shell radius i? m i n (5R = 0.1 A ). Conditions are the same as for the toy water model in section 
The Poisson equation was solved with the multi-grid method with Dirichlet boundary conditions 
from spherical harmonics up to order 4. It will be observed that the ESP charges are insensitive 
to R m in once the sample points are all outside the largest atomic orbital (radius ~ 2.6 A). The 
charges on C,0,N and H are 0.28, -0.34, -0.42 and 0.29e . 



2.3 Truncation of molecules with design atom pseudo-potentials 

It often is desirable to restrict the size of a quantum mechanical calculation to just the essential 
atoms for the problem at hand. Quantum mechanics/molecular mechanics (qm/mm) calculations 
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1.5 2 2.5 3 3.5 4 4.5 
SKIN RADIUS (A) 

Figure 3: ESP charges on the polar groups C=0 and NH in indigo (inset), as a function of the 
radius of the 'skin' of dft mesh points used in the ESP fit. The vertical line shows the largest 
atomic orbital radius. 




Figure 4: Truncation of a large molecule in a QM/MM model by cutting a bond and completing 
the dangling valence with an electron to form a lone pair mimicking the missing bond. 
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Figure 5: Comparison of design atom and standard Troullier-Martins pseudo-potentials for 
the 2s (thin) and 2p (thick lines) shells of carbon. Black: standard Troullier-Martins method, 
blue: design atom method, this work. Vertical lines: core radius, (a) Wave functions, showing 
squeezing from outside to inside the core to maintain the valence density of standard carbon; 
(b) Ionic pseudo-potentials, showing deepening in the core to accommodate the extra electronic 
density. Points: reference data kindly provided by Y. Zhang. 



are a prime example, as are cluster calculations in which a cluster of atoms of tractable size is 
cut out of a periodic material. In both cases it is necessary to correct dangling valencies left 
by the truncation, to reduce perturbation of the core region of the cluster. One way is to add 
capping atoms (usually hydrogens) to complete peripheral valencies. 

Another is to turn dangling bonds into lone pairs, an approach developed by Xiao and 
Zhang |^] with a view to Qm/mm calculations, cf. figure Perturbation of the core of the 
cluster is then minimised by transforming the peripheral atom so capped into a 'design' atom, 
with specific pseudo-potentials to mimic the electronic structure of the atom it replaces. We 
have extended this method to truncation on oxygen and applied it to larger systems, where we 
find the perturbation decays rapidly with distance from the cut bond. 

Pseudo-potentials may be produced and tested with the help of the ATOM program distributed 
with siesta. We coded 'design' Troullier-Martins, norm conserving pseudo-potentials in atom. 
In the design atom approach for carbon, one adds an electron and increases the nuclear charge 
by one unit. But the design atom is not just nitrogen, since to minimise discrepancies between 
the full and the truncated molecule, it is further required that the electronic density of the design 
atom outside the core matches that of carbon. This constraint is achieved by rescaling the wave 

1 /2 

functions outside the core radius, per angular momentum channel, I, by r\i — [Ni/Nf) , where 
Ni is the number of valence electrons of the original atom in shell I and Nf that in the same 
shell of the design atom. 
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Figure 6: Use of a design atom oxygen to truncate a benzyl carbamate model of a graftable dye: 
(a) Structure, indicating the bond cut and replaced by a lone pair; (b) Errors in the positions of 
atoms in the truncated molecule with respect to the full molecule (numbering as in (a)). 



Thus, for carbon, rj 2p = y/2/3, and one expects r] 2s — 1. Compared to standard carbon, the 
radial 2p valence wavefunction of design carbon is therefore depressed outside the core radius, 
and exalted within, to accommodate the extra electron, figure ([5p). In fact Xiao and Zhang 
adjusted 772s, optimising it with respect to the geometrical parameters of their target molecules, 
yielding in their case 772s = 0.91. Our implementation closely follows ref. [32], and indeed our 
design carbon pseudo-potentials agree very closely with those of Xiao and Zhang, see figure ((5b). 



We extended the design atom approach to truncation on oxygen. In this case, r]2 P — \/4/5 ~ 
0.89. We varied 772s in a series of truncations on the oxygen atom of the toy benzyl carbamate 
in figure (|6p), similar in structure to graftable photosensitizers of reactive oxygen species of 
interest to us[22 . Here, we show results for 772s = 1; reducing it deteriorated the results. Figure 
(|6b) compares the geometries of the full and the truncated forms, both fully optimised (lda, 
standard DZP basis). The important part of the molecule is the phenyl ring, standing in for the 
chromophore. Our measure of quality is therefore to bring three phenyl carbons (atoms 4,1 and 3 
in fig. |6^), of both the full and the truncated molecules, respectively to the origin, on to the Ox 
axis and into the Oxy plane, and to compute the distances between corresponding atoms, shown 
in fig. ([6b ) as a scatter plot of error vs. distance to the oxygen atom before truncation. It will 
be observed that the error of placement of atoms in the fragment relative to the full molecule, 
drops off fast as a function of the distance from the design oxygen, being under 10 -2 A for those 
in the ring. Complete Z-matrices are provided in the supplementary information. 
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2.4 Linear response tddft 
Context 

A new, order 0(N 3 ) tddft code was described recently [141 [T51 |2"T] . exploiting the strictly finite 
range of the numerical atomic orbitals in siesta. The optical absorption of a molecule, for a light 
electric field at angular frequency ui, polarised in the a direction (a = x, y, z), is proportional to 
the complex part of the polarisability 

P a (u) = J d 3 rd 3 r'r aX {r,r',uj)r' a , (5) 

where x (r, r', w) is the susceptibility, and we use the fact that molecules are much smaller than 
the wavelength. The susceptibility x is expressed in terms of that of the non-interacting electrons 
in the Kohn-Sham approach, Xo> and the Hartree and exchange-correlation kernel S: 

xM = (i-xoMS)" 1 xoM. (6) 

The non-interacting electron response function xoi reads in terms of Kohn-Sham orbitals: 
Xo(r,r',u) = 4>E(r)ip F (r)ij) F {r')ij} E (r') 

E<0,F>0 

X (u-(E-F)+je ~ w + (E-F)+je) ' (?) 

where the sum runs over transitions between filled and empty orbitals with energies E and F, 
j = — 1, and e is a regularisation parameter. Since the orbitals tpE may be expressed as linear 
combinations of atomic orbitals (AO's), this form of xo exhibits dependence on AO pair products. 

References |14l ITS] point out the high degree of linear dependence in the AO product space 
and the means to drastically reduce it by expressing AO products as linear combinations of 
dominant products found by diagonalisation of an appropriate metric. Paper |21j avoids explicit 
inversion in eqn. ^ and introduces an efficient parallel solution of the relevant equations: 

3 

P(Lj) = J2<di,Xi(u) > 

(8) 

(l-xV)E)*i(u)=xVH> i = 1,2,3 

where di is the dipole in the i-direction. Each linear system is solved by the Krylov GMRES 
methodpB]. Solution of eqns. ^ at a set of frequencies u leads to a raw absorption spectrum. 

Note that because e is a regularisation parameter, the raw spectrum for a particular value 
of e has no absolute physical meaning. Figure ^ shows how, given sufficient points in the raw 
spectrum, close resonances are distinguishable as the regularisation parameter is reduced. For 
convenience of representation, we plot eP e (oj) rather than P e (w), which diverges at resonances as 
e — > 0. Finding with any certainty even just the main resonances in a given frequency interval 
[wmm,Wmai] would seem to require making e very small and using a very large number of points, 
of order (uj max — Lo min )/e. However, quite apart from the computational cost, it is ineffective 
to try to separate close resonances by brute force reduction of e and increasing the number of 
frequency points oj. Close to resonances of the free electron response, cj ~ E — F in eqn. |7]), 
Xo diverges as 1/e and ill-conditioning may prevent convergence of the GMRES method. We have 
observed slow convergence, or absence of convergence, or even negative polarisabilities at some 
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Figure 7: Absorption spectra of indigo computed with regularly spaced frequency points and 
decreasing regularisation parameter e=0.05, 0.01, 0.005 and 0.001 Rydberg, from top to bottom. 

frequency points when e is reduced below 5 x 10 -4 Ry. There is no clear way to preconditioning 
the linear systems so some other strategy is needed. 

Furthermore, the method as it stands has other drawbacks: (i) The shape of the spectrum 
depends on the regularisation parameter e. When it is too big, a weak transition may go unnoticed 
in the wing of a strong one; but making it too small is wasteful, with most values of P(lu) 
very small compared to a handful of large values at the resonances; (ii) Little can be done to 
identify the nature of the transitions observed in the spectrum, since not even oscillator strengths 
are extracted; (Hi) The tddft computation is run after SIESTA, requiring geometrical, orbital, 
Hamiltonian and overlap data to be communicated in a disk file. 

Present improvements 

An immediate step to improving separation of resonances is to deal separately with each polari- 
sation in eqn. ([8]), since transitions often have different polarisations. Here we further improved 
the computation in several ways. 

First, addressing point (Hi), the 'fast' tddft calculation can be invoked now from within 
SIESTA, the 'move' loop of the SIESTA main program. This is achieved by coupling fast directly 
to siesta using the MPICPL (MPI Coupling) framework^], mpicpl is dedicated to the coupling 
of scientific codes, based on the well-known MPI standard. It is divided into several independent 
layers for coupling, data redistribution and steering. The codes to be coupled are launched and 
connections between them are set up by mpicpl, according to information derived from an xml 



Second, to address points (i) and (ii), note that the expected form of the spectrum is a sum of 
Lorentzian resonances. By fitting the parameters of these Lorentzians to the numerical spectrum, 
we can in most cases identify the transitions without recourse to small e and very fine combs of 
u> values. We relate the peak heights of the numerical spectrum to the oscillator strengths of the 
transitions by considering that in the linear response regime, electronic transitions respond to 
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the driving field like independent harmonic oscillators (24] , so that the susceptibility 

x(w) = x , M + ix"(w) 

can be written in atomic units as 

xM = E 



// 



where fij, /j and Tj are the frequency, oscillator strength and damping (homogeneous linewidth) 
of transition I. Here, we need the complex part of the susceptibility, x"(u;), which after a little 
algebra can be expressed as a sum of terms of the form 

2ioTifi 

(fir - w) 2 (fi 7 + u) 2 + 4lj 2 T 2 ' 

Close to a resonance (elsewhere the susceptibility is negligible) , ui ~ fij, and fij + w ~ 2u ~ 2fi/, 
so that 

x (w) ~?(fi7^F+rf- 

Identifying the regularisation parameter e in the linear response tddft with the damping L/ , 
we see that the numerical spectrum should be representable as a sum of normalised (unit area) 
Lorentzian resonances, 

where the weights Cj are related to the oscillator strengths by 

2fi/ 7T ' 

In eqn. ([9]), B represents the more or less flat contribution of resonances outside the frequency 
range [uj m - nl , w max ] where the raw spectrum was computed. We perform a non-linear least squares 
fit of the parameters C7, fi/ and B to minimise the residual: 

^freq 

x 2 = — E OCoddH-W,) 2 , (io) 

Jv freq . =1 

using the Levenberg-Marquardt method from the GNU scientific library [T] and iVf req frequency 
points ujj in the numerical tddft spectrum, with intensities Ij. 

Fitting a Lorentzian requires at least four data points. If Nl resonances are expected in 
the frequency range [w m i n , w max ], the raw spectrum should comprise at least -/Vf rcq = Ni = 4:Nl 
points. A reasonable trial e would be e = e\ = (w max — w m in) /N\. The number of certain 
resonances and their positions are determined by an inspection algorithm detecting local maxima 
of the spectrum from adjacent regions of increasing or decreasing values. Clearly, the smaller 
the value of e, the sharper the resonances will become, and the greater will be the number of 
distinguishable resonances. But because of the caveat noted above and of the computational 
cost, e should not be made too small. 

The question now is how to best choose the ujj . In practice, data in the wings of the resonances 
contribute less to the accuracy of the Lorentzian fit than do points close to the peaks. Since 
the resonances are at first unknown, one must start with a uniform distribution of u>j but it 
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Figure 8: Behaviour of the integral function J(ui) used to to concentrate tddft computation 
points tjj around the resonances of the current best estimate of the spectrum (in this case the 
top spectrum in figure |7| . 



is possible to continue with an iterative, adaptive procedure, to cluster the points around the 
current best approximations to the resonance frequencies. 
With this purpose, define a normalized integral function as 



J{uj) = I P(uj)duj J I P(uj)duj . (11) 

^min / -^min 

As P is a positive function, J increases monotonically (from to 1) as w sweeps the interval 
[w m j n , w max ] . In practice, J {of) increases step- wise at each resonance, the smaller the value of e, 
the steeper the steps and the flatter the plateaux between the steps, see fig. pj. Therefore, we 
can easily construct a local inverse function where P is non zero. The trick is now to use the 
inverse function to map a set of regularly spaced values of J into a set w's clustered around the 
resonances. Since at any time we know only a finite number of (u>i, J(uji)) pairs, the local inverse 
function is defined via linear interpolation: 

Japprox(Wj = J(Ui) (W-WiJ LO G [£t)<, LOi+l\- (12) 

Figure (|8| illustrates this procedure for the topmost spectrum (e = 0.05 a.u.) in figure Q. 

The complete algorithm is provided in the supplementary information, with results for uni- 
formly spaced data. Briefly, we start with a uniform distribution of points and iterate the 
clustering around the current best estimates of the resonances. At each stage of the procedure, 
the number of certain resonances to be used in the fit is determined by inspection of the numer- 
ical spectrum, and the inverse integral function is used to improve the distribution of the points 
around these resonances before refitting the Lorentzians. After iterating the improvement of the 
distribution of points, the algorithm may decrease e or increase the number of points, or do both. 

Application to indigo 

Before going into more detail about the choice of adaptive strategies in the algorithm, figure 
d9l validates the present approach. It shows results for a set of substituted benzenes. The 
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Figure 9: Comparison of transitions obtained with the present adaptive algorithm (LDA, numer- 
ical DZP basis, vertical axis) to those obtained by solution of Casida's eigenproblem equations ( 
LSD, 6-31G*, horizontal axis), for a series of substituted benzenes, (a) Transition energies; (b) 
oscillator strengths. Symbols : □ aniline; o anisole; A dimethylaniline; y phenol; o phentole; + 
thiophenol; x toluene . 



figure compares transition energies and oscillator strengths obtained here (lda, standard DZP 
valence basis and pseudo-potentials in siesta) to those obtained with a similar level of theory 
(lsd, 6-3 1G* all electron basis) by solving Casida's eigenproblem casting of linear response 
theorv[TUl |2"U1 [T7j. Calculations were performed on the same geometries, optimised in siesta. 
Despite differences in the basis sets, the agreement is good, including even most of the weaker 
transitions. 

Table [T] illustrates four ways to use the adaptive fitting to improve the accuracy on the 
transitions obtained for the visible spectrum of indigo in the range 0.02 to 0.4 Ry, computed with 
the same DFT conditions as in section |2.2| The final value of the regularization parameter is in 
each case e = 10~ 3 Ry. Table [l] shows two strategies. Cases 1-4 illustrate constant e (10 3 Ry) 
and variable placing or numbers of data points. In case 5, both e and the number of data 
points are varied, the latter being determined by e. The idea, here is to use fewer points placed 
depending on the regularized parameter. We summarize below the parameters of the different 
strategies : 

Test 1 Here, the number of points is constant and we iteratively improve the point place- 
ment. Considering that resonances will not be separable if closer than e, up to 

-^trans ~ (Wmai — W lnax )/2e (13) 

resonances could in principle be distinguished, given 4 points for each. The minimum 
number of points required is thus: 

3 

Nfrgq = 1 + — {LO max — LU min ). (14) 

Test 2 and test 3 are the same as test 1, but use fewer points, respectively 285 and 101. 

Test 4 starts with the same small number of points as test3, and at each iteration Nf Teq is 
increased by 25%. 
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Test 1 


test 2 


test 3 


test 4 


test 5 


e mitial ( Ry ) 
£ final ( Ry ) 


1 x 10" 3 
1 x 1CT 3 


1 x 10" 3 
1 x 10~ 3 


1 x 10" 3 
1 x 10" 3 


1 x 10" 3 
1 x 10~ 3 


2 x 10" 2 
1 x 10~ 3 


Arinitial 
JV freq 
Arfinal 
iV freq 


571 
571 


285 
285 


101 
101 


101 
198 


50 
255 


Niter 
NevaX 


2 

1142 


2 
570 


5 
505 


4 
583 


5 
663 


^trails 

x/Pla 


7 

3.0 x 10~ 4 
2.2 x 10~ 4 


7 

3.4 x 10~ 4 
2.1 x 10~ 5 


4 

1.7 x 10~ 3 
1.0 x 10~ 4 


6 

5.9 x 10~ 3 
8.9 x 10~ 4 


7 

4.2 x 10~ 4 
6.2 x 10~ 4 



Table 1: Statistics of different iterative strategies to improve the fitting of Lorentzians to the 
numerical tddft spectrum of indigo. Nj£^ , N^.^ : initial and final numbers of data points; 
iViter : number of iterations to reach an error of 5.0 x 10 -2 Ry ; iV eva i : total number of evaluations 
of P(ui) at convergence; A'trans : number of transitions found; x 2 : the residual defined by eqn. 
(10 1. <5/ max is the maximum variation of the oscillator strengths between the last two iterations 



at convergence. 



Test 5 starts with e — 2x 10 -2 Ry and iVf req = 50 and e is reduced in equal steps to 10 -3 Ry 
in two iterations. At each iteration the number of points increases by 25 % in order to have 
enough points for the fit algorithm. 

Table [TJ shows the consequences of these strategies. Test 1 represents a brute force approach 



But most real molecules will have far fewer transitions than implied by eqn (13). Accordingly, 
tests 2-3 show that the number of points can be reduced without loss of information, but that 
eventually (test 3) some weak transitions are missed. These are recovered in test 4, where data 
points are added, but only as necessary. The number of evaluations of P(u>) is half that required 
by brute force, for a comparable result. Test 5 achieves an even better result, at the expense of 
slightly more function evaluations, by decreasing e. 

Figure [10] shows the spectrum and the fits obtained in tests 4 and 5. Resonances found in the 
tests agree very well, to within about 10 -5 Ry (4 cm -1 ) for the resonance frequency and about 
0.1 % for the oscillator strengths. These results illustrate a further benefit of the fitting procedure, 
that, using a moderate e (10 -3 Ry), it achieves an accuracy that could only be achieved by brute 
force with much smaller e, for which the tddft algorithm would be numerically unstable. The 
energy and oscillator strength of the first excited state of indigo are 0.1515 Ry (or a wavelength 
of 602 nm in vacuum) and 0.2, values which could certainly be improved (the experimental 
transition is at 0.136Ry or 546nm[27_), should hybrid functionals become available in siesta. 



3 Concluding remarks 

We have described several extensions of the widely used SIESTA program, aimed at making it 
more directly applicable to molecular systems. Some closing words may be appropriate on the 
methods chosen. 

Self polarisation in the periodic model of an isolated water molecule was small. However, 
larger effects can be expected for systems which are both polar and contain tt electrons, such as 
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Figure 10: The UV-visible part of the absorption spectrum of indigo; semi-log scale to highlight 
strong and weak resonances. Data points : numerical spectrum; thin line : the Lorentzian fit of 
test 4 misses the highest energy resonance, which is recovered in test 5 (thick line). 



many common laser dyes, in which a degree of intramolecular charge transfer is present, even in 
the ground state. Large dipole moments are common in laser dyes, which moreover may contain 
20-40 heavy atoms. Simulation boxes big enough to effectively separate such systems from their 
images could become uncomfortably large. The present multi-grid solver for the Poisson equation 
with Dirichlet boundary conditions meets this need, in a framework that should make further 
improvements possible, such as use of finite elements to represent the usually inhomogeneous 
electronic density. 

Electrostatic potential fitting is an old problem, with a vast literature. The algorithm in- 
troduced here in SIESTA is very simple. While it so far has worked satisfactorily for molecules 
of interest to us, general users should be aware that it may produce unpredictable results on 
systems with 'buried' atoms. The algorithm here has also been adapted to fit the electrostatic 
potential computed in siesta in periodic systems. See however Campaha et al.\§\ for a lucid 
discussion of the problems associated with charge fitting in such systems. 

The design atom approach to truncation of molecules does not appear to have been taken 
up in the literature, possibly because Xiao and Zhang exhibited significant perturbation of the 
truncated moieties of small molecules [32J. Indeed, we too found for example that dihedrals close 
to the design atom may be in error by 10-20°. Yet, as shown here, the perturbation in fact 
decays rapidly with distance from the design atom, making the method attractive for truncation 
of large systems. 

It should also be mentioned that the lone pair of the design atom may give rise to spurious 
n — 7r* transitions towards a conjugated region of the truncated molecule. They appear as weak 
transitions in the low energy wing of the spectrum, with intensities falling off exponentially with 
the distance to the design atom. This easily identified effect is tolerable in large molecules, in view 
of the gain in computational cost brought by truncation. A typical case would be truncation in 
quantum mechanical/classical mechanical simulations, where a number of other approximations 
might be be more serious than the presence of these weak but identified transitions 

It may be useful to put the linear-response tddft method in refs. |14|[T51l2T] and the present 
improvements in perspective with commoner approaches, such as for example Casida's widely 
implemented equations PHI |2U|. The point is that whereas solution of Casida's equations yields 
at significant cost a list of transitions and their properties, the linear response tddft method 
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produces a piece of the raw spectrum, at low cost, but with little physical insight, such as orbital 
symmetries and energies, transition polarisations and oscillator strengths. 

The present extensions, by efficiently extracting transition energies, oscillator strengths and 
a degree of polarisation information, should be useful to identify transitions by comparison with 
a one-off solution of Casida's equations for the same system. The strength of the present method 
would then be to allow cheap, repetitive calculations to study how the transition reacts to 
multiple perturbations of the geometry, or a varying external field, because P(uj) then needs to 
be computed only in narrow frequency intervals bracketing the interesting resonances. Indeed, 
useful experimental data are in the vast majority of cases restricted to the first one or two 
strongest excited states only. For example, the first transition of indigo on a recent 12-core 
computer could be computed in less 30 s of which over 90% of time was spent building the 
Hartree and exchange-correlation contributions (half each) to the interaction kernel. The cost of 
the iterative procedure was negligible. Computing spectra on the fly during molecular dynamics, 
e.g. solvation shifts in liquids, is thus a realistic prospect of the present methods. 
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Supplementary Information 
1 Linear scaling of the MG solver of the Poisson equation 

Figure [TT] shows the linear dependence of the sequential execution time of the MG solver on the 
size of the problem. 




0.1 1 

MESH POINTS (x 10" 6 ) 



Figure 11: Log-log plot of the CPU time (sequential, Nehalem Intel Xeon X5550, 2.66 GHz) of 
the multi-grid solution of the Poisson equation vs. system size. Line shows linear scaling. 



2 Use of the O design atom 




HI4 

Figure 12: Atomic numbering used to compare the Z-matrices. 
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The Z-matrices of the optimised geometries (DZP, LDA, standard Troullier-Martins pseudo- 
potentials) of the full bcnzylcarbamide and the truncated form are provided below. Atom num- 
bers differ from the main text; refer figure |T2| above. 
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Figure 13: Errors of internal coordinates of the truncated vs. the full molecule (numbering, see 
Z-matrices and figure [12] 

Figure [13] provides an overview of the errors in the different types of internal coordinates. 



with reference to the Z-matrices and the atomic numbering in figure 12 (not the same as in the 
main text). 



RR n° 8221 



26 



Coulaud 



Adaptive algorithm for fitting Lorentzians to the raw TDDFT 
spectrum 



Algorithm 1: Adaptive algorithm 



Initialization (p = 1): 
begin 

Start with a small number Ni of uniformly distributed frequency points 



J 



1 



step Fit 



and a not too small value of e = ei. 

for it = 1, iter Max do 

Compute P on the current uij. 
Build the new distribution: 
begin 

Define a set of uniformly spaced Ui = i/N p , i = 1, . . . N p in [0,1]. 
Find u>i such that J(u>i) <Ui < J(u>i+i) 
The new improved frequency is tOi 

Perform the least squares fit and determine the transitions fl^ and associated 
oscillator strengths 
Check the convergence: 



^approx(^i) 



max ( max 1 1 f2l 



, max II /? 



/r i ii)<M 



Reduce the regularization parameter if necessary. 



The different steps of the adaptive algorithm are illustrated in Algorithm [T] We start the 
algorithm with a uniform distribution of points. In |step Fit] of Algorithm [T] first we determine 
the number of Lorentzian involved in the spectra, using a peak detection algorithm, inspecting 
for adjacent regions of increasing and decreasing values. The adaptive algorithm converges when 
the maximum difference between two iterations of the At rans first (i.e. strongest) transition 
frequencies and the oscillator strengths are less than a given threshold \i. 

Tables [2] and [3] below provide additional data to complement the main text, on the amount 
of information recovered under different conditions in the spectrum of indigo in the range 
[0.02, 0.4] Ry . They show respectively, the influence of the regularisation parameter e at constant 
number of data points iVf req = 257 and at minimal -ZVf roq = 1 + ^(uj m ax — ^>m%n)- In both cases, 
the data points are distributed uniformly. The residual x 2 is defined by 



X 2 = \(I(c)- X 'i 



orentzian 



M)| 2 /|/H| 2 , 



where |.|2 is the discrete I 2 norm, XLorentian ^ s ^ ne Lorentian approximation to the experimental 
data I(u>). All other calculation conditions are the same as in the main text. 
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Table 2: Influence of e (at N hcq = 257) on the frequency and oscillator strength recovered for 
the two strongest transitions of indigo in the interval [0.02, 0.4] Ry . - means nothing found, * 
means wrong resultss (negative polarisability) . 
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Table 3: Influence of e, automatic choice for the number of frequencies ( - means nothing found.) 



RR n° 8221 




/ \ 

RESEARCH CENTRE 
BORDEAUX - SUD-OUEST 

200 avenue de la Vieille Tour 
33405 Talence Cedex 



Publisher 
Inria 

Domaine de Voluceau - Rocquencourt 
BP 105 - 78153 Le Chesnay Cedex 
inria.fr 

ISSN 0249-6399 



